{
  "cells": [
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "Tce3stUlHN0L"
      },
      "source": [
        "##### Copyright 2020 The TensorFlow Authors."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "cellView": "form",
        "id": "tuOe1ymfHZPu"
      },
      "outputs": [],
      "source": [
        "#@title Licensed under the Apache License, Version 2.0 (the \"License\");\n",
        "# you may not use this file except in compliance with the License.\n",
        "# You may obtain a copy of the License at\n",
        "#\n",
        "# https://www.apache.org/licenses/LICENSE-2.0\n",
        "#\n",
        "# Unless required by applicable law or agreed to in writing, software\n",
        "# distributed under the License is distributed on an \"AS IS\" BASIS,\n",
        "# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n",
        "# See the License for the specific language governing permissions and\n",
        "# limitations under the License."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "qFdPvlXBOdUN"
      },
      "source": [
        "# 张量简介"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "MfBg1C5NB3X0"
      },
      "source": [
        "<table class=\"tfo-notebook-buttons\" align=\"left\">\n",
        "  <td><a target=\"_blank\" href=\"https://tensorflow.google.cn/guide/tensor\" class=\"\">     <img src=\"https://tensorflow.google.cn/images/tf_logo_32px.png\" class=\"\">     在 TensorFlow.org 上查看</a></td>\n",
        "  <td><a target=\"_blank\" href=\"https://colab.research.google.com/github/tensorflow/docs-l10n/blob/master/site/zh-cn/guide/tensor.ipynb\" class=\"\"><img src=\"https://tensorflow.google.cn/images/colab_logo_32px.png\" class=\"\">在 Google Colab 中运行 </a></td>\n",
        "  <td><a target=\"_blank\" href=\"https://github.com/tensorflow/docs-l10n/blob/master/site/zh-cn/guide/tensor.ipynb\" class=\"\"><img src=\"https://tensorflow.google.cn/images/GitHub-Mark-32px.png\" class=\"\">在 GitHub 中查看源代码</a></td>\n",
        "  <td><a href=\"https://storage.googleapis.com/tensorflow_docs/docs-l10n/site/zh-cn/guide/tensor.ipynb\" class=\"\"><img src=\"https://tensorflow.google.cn/images/download_logo_32px.png\" class=\"\"> 下载笔记本</a></td>\n",
        "</table>"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "AL2hzxorJiWy"
      },
      "outputs": [],
      "source": [
        "import tensorflow as tf\n",
        "import numpy as np"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "VQ3s2J8Vgowq"
      },
      "source": [
        "张量是具有统一类型（称为 `dtype`）的多维数组。您可以在 `tf.dtypes.DType` 中查看所有支持的 `dtypes`。\n",
        "\n",
        "如果您熟悉 [NumPy](https://numpy.org/devdocs/user/quickstart.html)，就会知道张量与 `np.arrays` 有一定的相似性。\n",
        "\n",
        "就像 Python 数值和字符串一样，所有张量都是不可变的：永远无法更新张量的内容，只能创建新的张量。\n"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "DRK5-9EpYbzG"
      },
      "source": [
        "## 基础知识\n",
        "\n",
        "我们来创建一些基本张量。"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "uSHRFT6LJbxq"
      },
      "source": [
        "下面是一个“标量”（或称“0 秩”张量）。标量包含单个值，但没有“轴”。"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "d5JcgLFR6gHv"
      },
      "outputs": [],
      "source": [
        "# This will be an int32 tensor by default; see \"dtypes\" below.\n",
        "rank_0_tensor = tf.constant(4)\n",
        "print(rank_0_tensor)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "tdmPAn9fWYs5"
      },
      "source": [
        "“向量”（或称“1 秩”张量）就像一个值的列表。向量有 1 个轴："
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "oZos8o_R6oE7"
      },
      "outputs": [],
      "source": [
        "# Let's make this a float tensor.\n",
        "rank_1_tensor = tf.constant([2.0, 3.0, 4.0])\n",
        "print(rank_1_tensor)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "G3IJG-ug_H4u"
      },
      "source": [
        "“矩阵”（或称“2 秩”张量）有 2 个轴："
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "cnOIA_xb6u0M"
      },
      "outputs": [],
      "source": [
        "# If we want to be specific, we can set the dtype (see below) at creation time\n",
        "rank_2_tensor = tf.constant([[1, 2],\n",
        "                             [3, 4],\n",
        "                             [5, 6]], dtype=tf.float16)\n",
        "print(rank_2_tensor)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "19m72qEPkfxi"
      },
      "source": [
        "<table>\n",
        "<tr>\n",
        "  <th>标量，形状：<code>[]</code>\n",
        "</th>\n",
        "  <th>向量，形状：<code>[3]</code>\n",
        "</th>\n",
        "  <th>矩阵，形状：<code>[3, 2]</code>\n",
        "</th>\n",
        "</tr>\n",
        "<tr>\n",
        "  <td>    <img alt=\"A scalar, the number 4\" src=\"images/tensor/scalar.png\">\n",
        "</td>\n",
        "  <td>    <img alt=\"The line with 3 sections, each one containing a number.\" src=\"images/tensor/vector.png\">\n",
        "</td>\n",
        "  <td>    <img alt=\"A 3x2 grid, with each cell containing a number.\" src=\"images/tensor/matrix.png\">\n",
        "</td>\n",
        "</tr>\n",
        "</table>\n"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "fjFvzcn4_ehD"
      },
      "source": [
        "张量的轴可能更多，下面是一个包含 3 个轴的张量："
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "sesW7gw6JkXy"
      },
      "outputs": [],
      "source": [
        "# There can be an arbitrary number of\n",
        "# axes (sometimes called \"dimensions\")\n",
        "rank_3_tensor = tf.constant([\n",
        "  [[0, 1, 2, 3, 4],\n",
        "   [5, 6, 7, 8, 9]],\n",
        "  [[10, 11, 12, 13, 14],\n",
        "   [15, 16, 17, 18, 19]],\n",
        "  [[20, 21, 22, 23, 24],\n",
        "   [25, 26, 27, 28, 29]],])\n",
        "                    \n",
        "print(rank_3_tensor)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "rM2sTGIkoE3S"
      },
      "source": [
        "对于包含 2 个以上的轴的张量，您可以通过多种方式将其可视化。"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "NFiYfNMMhDgL"
      },
      "source": [
        "<table>\n",
        "<tr>\n",
        "  <th colspan=\"3\">3 轴张量，形状：<code>[3, 2, 5]</code>\n",
        "</th>\n",
        "</tr>\n",
        "<tr>\n",
        "</tr>\n",
        "<tr>\n",
        "  <td>    <img src=\"images/tensor/3-axis_numpy.png\">\n",
        "</td>\n",
        "  <td>    <img src=\"images/tensor/3-axis_front.png\">\n",
        "</td>\n",
        "  <td>    <img src=\"images/tensor/3-axis_block.png\">\n",
        "</td>\n",
        "</tr>\n",
        "</table>"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "oWAc0U8OZwNb"
      },
      "source": [
        "通过使用 `np.array` 或 `tensor.numpy` 方法，您可以将张量转换为 NumPy 数组："
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "J5u6_6ZYaS7B"
      },
      "outputs": [],
      "source": [
        "np.array(rank_2_tensor)"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "c6Taz2gIaZeo"
      },
      "outputs": [],
      "source": [
        "rank_2_tensor.numpy()"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "hnz19F0ocEKD"
      },
      "source": [
        "张量通常包含浮点型和整型数据，但是还有许多其他数据类型，包括：\n",
        "\n",
        "- 复杂的数值\n",
        "- 字符串\n",
        "\n",
        "`tf.Tensor` 基类要求张量是“矩形”——也就是说，每个轴上的每一个元素大小相同。但是，张量有可以处理不同形状的特殊类型。\n",
        "\n",
        "- 不规则张量（参阅下文中的 [RaggedTensor](#ragged_tensors)）\n",
        "- 稀疏张量（参阅下文中的 [SparseTensor](#sparse_tensors)）"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "SDC7OGeAIJr8"
      },
      "source": [
        "我们可以对张量执行基本数学运算，包括加法、逐元素乘法和矩阵乘法运算。"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "-DTkjwDOIIDa"
      },
      "outputs": [],
      "source": [
        "a = tf.constant([[1, 2],\n",
        "                 [3, 4]])\n",
        "b = tf.constant([[1, 1],\n",
        "                 [1, 1]]) # Could have also said `tf.ones([2,2])`\n",
        "\n",
        "print(tf.add(a, b), \"\\n\")\n",
        "print(tf.multiply(a, b), \"\\n\")\n",
        "print(tf.matmul(a, b), \"\\n\")"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "2smoWeUz-N2q"
      },
      "outputs": [],
      "source": [
        "print(a + b, \"\\n\") # element-wise addition\n",
        "print(a * b, \"\\n\") # element-wise multiplication\n",
        "print(a @ b, \"\\n\") # matrix multiplication"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "S3_vIAl2JPVc"
      },
      "source": [
        "各种运算 (op) 都可以使用张量。"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "Gp4WUYzGIbnv"
      },
      "outputs": [],
      "source": [
        "c = tf.constant([[4.0, 5.0], [10.0, 1.0]])\n",
        "\n",
        "# Find the largest value\n",
        "print(tf.reduce_max(c))\n",
        "# Find the index of the largest value\n",
        "print(tf.argmax(c))\n",
        "# Compute the softmax\n",
        "print(tf.nn.softmax(c))"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "NvSAbowVVuRr"
      },
      "source": [
        "## 形状简介"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "hkaBIqkTCcGY"
      },
      "source": [
        "张量有形状。下面是几个相关术语：\n",
        "\n",
        "- **形状**：张量的每个维度的长度（元素数量）。\n",
        "- **秩**：张量的维度数量。标量的秩为 0，向量的秩为 1，矩阵的秩为 2。\n",
        "- **轴**或**维度**：张量的一个特殊维度。\n",
        "- **大小**：张量的总项数，即乘积形状向量\n"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "E9L3-kCQq2f6"
      },
      "source": [
        "注：虽然您可能会看到“二维张量”之类的表述，但 2 秩张量通常并不是用来描述二维空间。"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "VFOyG2tn8LhW"
      },
      "source": [
        "张量和 `tf.TensorShape` 对象提供了方便的属性来访问："
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "RyD3yewUKdnK"
      },
      "outputs": [],
      "source": [
        "rank_4_tensor = tf.zeros([3, 2, 4, 5])"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "oTZZW9ziq4og"
      },
      "source": [
        "<table>\n",
        "<tr>\n",
        "  <th colspan=\"2\">4 秩张量，形状：<code>[3, 2, 4, 5]</code>\n",
        "</th>\n",
        "</tr>\n",
        "<tr>\n",
        "  <td> <img alt=\"A tensor shape is like a vector.\" src=\"images/tensor/shape.png\">\n",
        "</td>\n",
        "<td> <img alt=\"A 4-axis tensor\" src=\"images/tensor/4-axis_block.png\">\n",
        "</td>\n",
        "  </tr>\n",
        "</table>\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "MHm9vSqogsBk"
      },
      "outputs": [],
      "source": [
        "print(\"Type of every element:\", rank_4_tensor.dtype)\n",
        "print(\"Number of dimensions:\", rank_4_tensor.ndim)\n",
        "print(\"Shape of tensor:\", rank_4_tensor.shape)\n",
        "print(\"Elements along axis 0 of tensor:\", rank_4_tensor.shape[0])\n",
        "print(\"Elements along the last axis of tensor:\", rank_4_tensor.shape[-1])\n",
        "print(\"Total number of elements (3*2*4*5): \", tf.size(rank_4_tensor).numpy())"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "bQmE_Vx5JilS"
      },
      "source": [
        "虽然通常用索引来指代轴，但是您始终要记住每个轴的含义。轴一般按照从全局到局部的顺序进行排序：首先是批次轴，随后是空间维度，最后是每个位置的特征。这样，在内存中，特征向量就会位于连续的区域。\n",
        "\n",
        "<table>\n",
        "<tr>\n",
        "<th>典型的轴顺序</th>\n",
        "</tr>\n",
        "<tr>\n",
        "    <td> <img alt=\"Keep track of what each axis is. A 4-axis tensor might be: Batch, Width, Height, Freatures\" src=\"images/tensor/shape2.png\">\n",
        "</td>\n",
        "</tr>\n",
        "</table>"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "FlPoVvJS75Bb"
      },
      "source": [
        "## 索引"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "apOkCKqCZIZu"
      },
      "source": [
        "### 单轴索引\n",
        "\n",
        "TensorFlow 遵循标准 Python 索引规则（类似于[在 Python 中为列表或字符串编制索引](https://docs.python.org/3/tutorial/introduction.html#strings)）以及 NumPy 索引的基本规则。\n",
        "\n",
        "- 索引从 `0` 开始编制\n",
        "- 负索引表示按倒序编制索引\n",
        "- 冒号 `:` 用于切片 `start:stop:step`\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "SQ-CrJxLXTIM"
      },
      "outputs": [],
      "source": [
        "rank_1_tensor = tf.constant([0, 1, 1, 2, 3, 5, 8, 13, 21, 34])\n",
        "print(rank_1_tensor.numpy())"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "mQYYL56PXSak"
      },
      "source": [
        "使用标量编制索引会移除维度："
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "n6tqHciOWMt5"
      },
      "outputs": [],
      "source": [
        "print(\"First:\", rank_1_tensor[0].numpy())\n",
        "print(\"Second:\", rank_1_tensor[1].numpy())\n",
        "print(\"Last:\", rank_1_tensor[-1].numpy())"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "qJLHU_a2XwpG"
      },
      "source": [
        "使用 `:` 切片编制索引会保留维度："
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "giVPPcfQX-cu"
      },
      "outputs": [],
      "source": [
        "print(\"Everything:\", rank_1_tensor[:].numpy())\n",
        "print(\"Before 4:\", rank_1_tensor[:4].numpy())\n",
        "print(\"From 4 to the end:\", rank_1_tensor[4:].numpy())\n",
        "print(\"From 2, before 7:\", rank_1_tensor[2:7].numpy())\n",
        "print(\"Every other item:\", rank_1_tensor[::2].numpy())\n",
        "print(\"Reversed:\", rank_1_tensor[::-1].numpy())"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "elDSxXi7X-Bh"
      },
      "source": [
        "### 多轴索引"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "Cgk0uRUYZiai"
      },
      "source": [
        "更高秩的张量通过传递多个索引来编制索引。\n",
        "\n",
        "对于高秩张量的每个单独的轴，遵循与单轴情形完全相同的索引规则。"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "Tc5X_WlsZXmd"
      },
      "outputs": [],
      "source": [
        "print(rank_2_tensor.numpy())"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "w07U9vq5ipQk"
      },
      "source": [
        "为每个索引传递一个整数，结果是一个标量。"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "PvILXc1PjqTM"
      },
      "outputs": [],
      "source": [
        "# Pull out a single value from a 2-rank tensor\n",
        "print(rank_2_tensor[1, 1].numpy())"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "3RLCzAOHjfEH"
      },
      "source": [
        "您可以使用整数与切片的任意组合编制索引："
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "YTqNqsfJkJP_"
      },
      "outputs": [],
      "source": [
        "# Get row and column tensors\n",
        "print(\"Second row:\", rank_2_tensor[1, :].numpy())\n",
        "print(\"Second column:\", rank_2_tensor[:, 1].numpy())\n",
        "print(\"Last row:\", rank_2_tensor[-1, :].numpy())\n",
        "print(\"First item in last column:\", rank_2_tensor[0, -1].numpy())\n",
        "print(\"Skip the first row:\")\n",
        "print(rank_2_tensor[1:, :].numpy(), \"\\n\")"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "P45TwSUVSK6G"
      },
      "source": [
        "下面是一个 3 轴张量的示例："
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "GuLoMoCVSLxK"
      },
      "outputs": [],
      "source": [
        "print(rank_3_tensor[:, :, 4])"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "9NgmHq27TJOE"
      },
      "source": [
        "<table>\n",
        "<tr>\n",
        "<th colspan=\"2\">选择批次中每个示例的所有位置的最后一个特征</th>\n",
        "</tr>\n",
        "<tr>\n",
        "    <td> <img alt=\"A 3x2x5 tensor with all the values at the index-4 of the last axis selected.\" src=\"images/tensor/index1.png\">\n",
        "</td>\n",
        "      <td> <img alt=\"The selected values packed into a 2-axis tensor.\" src=\"images/tensor/index2.png\">\n",
        "</td>\n",
        "</tr>\n",
        "</table>"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "fpr7R0t4SVb0"
      },
      "source": [
        "## 操作形状\n",
        "\n",
        "Reshaping a tensor is of great utility.\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "EMeTtga5Wq8j"
      },
      "outputs": [],
      "source": [
        "# Shape returns a `TensorShape` object that shows the size on each dimension\n",
        "var_x = tf.Variable(tf.constant([[1], [2], [3]]))\n",
        "print(var_x.shape)"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "38jc2RXziT3W"
      },
      "outputs": [],
      "source": [
        "# You can convert this object into a Python list, too\n",
        "print(var_x.shape.as_list())"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "J_xRlHZMKYnF"
      },
      "source": [
        "通过重构可以改变张量的形状。重构的速度很快，资源消耗很低，因为不需要复制底层数据。"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "pa9JCgMLWy87"
      },
      "outputs": [],
      "source": [
        "# We can reshape a tensor to a new shape.\n",
        "# Note that we're passing in a list\n",
        "reshaped = tf.reshape(var_x, [1, 3])"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "Mcq7iXOkW3LK"
      },
      "outputs": [],
      "source": [
        "print(var_x.shape)\n",
        "print(reshaped.shape)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "gIB2tOkoVr6E"
      },
      "source": [
        "数据在内存中的布局保持不变，同时使用请求的形状创建一个指向同一数据的新张量。TensorFlow 采用 C 样式的“行优先”内存访问顺序，即最右侧的索引值递增对应于内存中的单步位移。"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "7kMfM0RpUgI8"
      },
      "outputs": [],
      "source": [
        "print(rank_3_tensor)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "TcDtfQkJWzIx"
      },
      "source": [
        "如果您展平张量，则可以看到它在内存中的排列顺序。"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "COnHEPuaWDQp"
      },
      "outputs": [],
      "source": [
        "# A `-1` passed in the `shape` argument says \"Whatever fits\".\n",
        "print(tf.reshape(rank_3_tensor, [-1]))"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "jJZRira2W--c"
      },
      "source": [
        "一般来说，`tf.reshape` 唯一合理的用途是用于合并或拆分相邻轴（或添加/移除 `1`）。\n",
        "\n",
        "对于 3x2x5 张量，重构为 (3x2)x5 或 3x(2x5) 都合理，因为切片不会混淆："
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "zP2Iqc7zWu_J"
      },
      "outputs": [],
      "source": [
        "print(tf.reshape(rank_3_tensor, [3*2, 5]), \"\\n\")\n",
        "print(tf.reshape(rank_3_tensor, [3, -1]))"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "6ZsZRUhihlDB"
      },
      "source": [
        "<table>\n",
        "<th colspan=\"3\">一些正确的重构示例。</th>\n",
        "<tr>\n",
        "  <td> <img alt=\"A 3x2x5 tensor\" src=\"images/tensor/reshape-before.png\">\n",
        "</td>\n",
        "  <td>   <img alt=\"The same data reshaped to (3x2)x5\" src=\"images/tensor/reshape-good1.png\">\n",
        "</td>\n",
        "  <td> <img alt=\"The same data reshaped to 3x(2x5)\" src=\"images/tensor/reshape-good2.png\">\n",
        "</td>\n",
        "</tr>\n",
        "</table>\n"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "nOcRxDC3jNIU"
      },
      "source": [
        "重构可以处理总元素个数相同的任何新形状，但是如果不遵从轴的顺序，则不会发挥任何作用。\n",
        "\n",
        "利用 `tf.reshape` 无法实现轴的交换，要交换轴，您需要使用 `tf.transpose`。\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "I9qDL_8u7cBH"
      },
      "outputs": [],
      "source": [
        "# Bad examples: don't do this\n",
        "\n",
        "# You can't reorder axes with reshape.\n",
        "print(tf.reshape(rank_3_tensor, [2, 3, 5]), \"\\n\") \n",
        "\n",
        "# This is a mess\n",
        "print(tf.reshape(rank_3_tensor, [5, 6]), \"\\n\")\n",
        "\n",
        "# This doesn't work at all\n",
        "try:\n",
        "  tf.reshape(rank_3_tensor, [7, -1])\n",
        "except Exception as e:\n",
        "  print(f\"{type(e).__name__}: {e}\")"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "qTM9-5eh68oo"
      },
      "source": [
        "<table>\n",
        "<th colspan=\"3\">一些错误的重构示例。</th>\n",
        "<tr>\n",
        "  <td> <img alt=\"You can't reorder axes, use tf.transpose for that\" src=\"images/tensor/reshape-bad.png\">\n",
        "</td>\n",
        "  <td> <img alt=\"Anything that mixes the slices of data together is probably wrong.\" src=\"images/tensor/reshape-bad4.png\">\n",
        "</td>\n",
        "  <td> <img alt=\"The new shape must fit exactly.\" src=\"images/tensor/reshape-bad2.png\">\n",
        "</td>\n",
        "</tr>\n",
        "</table>"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "N9r90BvHCbTt"
      },
      "source": [
        "您可能会遇到非完全指定的形状。要么是形状包含 `None` 维度（维度的长度未知），要么是形状为 `None`（张量的秩未知）。\n",
        "\n",
        "除了 [tf.RaggedTensor](#ragged_tensors) 外，这种情况只会在 TensorFlow 的符号化计算图构建 API 环境中出现：\n",
        "\n",
        "- [tf.function](function.ipynb)\n",
        "- [Keras 函数式 API](keras/functional.ipynb)。\n"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "fDmFtFM7k0R2"
      },
      "source": [
        "## `DTypes` 详解\n",
        "\n",
        "使用 `Tensor.dtype` 属性可以检查 `tf.Tensor` 的数据类型。\n",
        "\n",
        "从 Python 对象创建 `tf.Tensor` 时，您可以选择指定数据类型。\n",
        "\n",
        "如果不指定，TensorFlow 会选择一个可以表示您的数据的数据类型。TensorFlow 将 Python 整数转换为 `tf.int32`，将 Python 浮点数转换为 `tf.float32`。另外，当转换为数组时，TensorFlow 会采用与 NumPy 相同的规则。\n",
        "\n",
        "数据类型可以相互转换。"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "5mSTDWbelUvu"
      },
      "outputs": [],
      "source": [
        "the_f64_tensor = tf.constant([2.2, 3.3, 4.4], dtype=tf.float64)\n",
        "the_f16_tensor = tf.cast(the_f64_tensor, dtype=tf.float16)\n",
        "# Now, let's cast to an uint8 and lose the decimal precision\n",
        "the_u8_tensor = tf.cast(the_f16_tensor, dtype=tf.uint8)\n",
        "print(the_u8_tensor)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "s1yBlJsVlFSu"
      },
      "source": [
        "## 广播\n",
        "\n",
        "广播是从 [NumPy 中的等效功能](https://numpy.org/doc/stable/user/basics.html)借用的一个概念。简而言之，在一定条件下，对一组张量执行组合运算时，为了适应大张量，会对小张量进行“扩展”。\n",
        "\n",
        "最简单和最常见的例子是尝试将张量与标量相乘或相加。在这种情况下会对标量进行广播，使其变成与其他参数相同的形状。 "
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "P8sypqmagHQN"
      },
      "outputs": [],
      "source": [
        "x = tf.constant([1, 2, 3])\n",
        "\n",
        "y = tf.constant(2)\n",
        "z = tf.constant([2, 2, 2])\n",
        "# All of these are the same computation\n",
        "print(tf.multiply(x, 2))\n",
        "print(x * y)\n",
        "print(x * z)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "o0SBoR6voWcb"
      },
      "source": [
        "同样，可以扩展大小为 1 的维度，使其符合其他参数。在同一个计算中可以同时扩展两个参数。\n",
        "\n",
        "在本例中，一个 3x1 的矩阵与一个 1x4 进行元素级乘法运算，从而产生一个 3x4 的矩阵。注意前导 1 是可选的：y 的形状是 `[4]`。"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "6sGmkPg3XANr"
      },
      "outputs": [],
      "source": [
        "# These are the same computations\n",
        "x = tf.reshape(x,[3,1])\n",
        "y = tf.range(1, 5)\n",
        "print(x, \"\\n\")\n",
        "print(y, \"\\n\")\n",
        "print(tf.multiply(x, y))"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "t_7sh-EUYLrE"
      },
      "source": [
        "<table>\n",
        "<tr>\n",
        "  <th>广播相加：<code>[3, 1]</code> 乘以 <code>[1, 4]</code> 的结果是 <code>[3,4]</code>\n",
        "</th>\n",
        "</tr>\n",
        "<tr>\n",
        "  <td> <img alt=\"Adding a 3x1 matrix to a 4x1 matrix results in a 3x4 matrix\" src=\"images/tensor/broadcasting.png\">\n",
        "</td>\n",
        "</tr>\n",
        "</table>\n"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "9V3KgSJcKDRz"
      },
      "source": [
        "下面是不使用广播的同一运算："
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "elrF6v63igY8"
      },
      "outputs": [],
      "source": [
        "x_stretch = tf.constant([[1, 1, 1, 1],\n",
        "                         [2, 2, 2, 2],\n",
        "                         [3, 3, 3, 3]])\n",
        "\n",
        "y_stretch = tf.constant([[1, 2, 3, 4],\n",
        "                         [1, 2, 3, 4],\n",
        "                         [1, 2, 3, 4]])\n",
        "\n",
        "print(x_stretch * y_stretch)  # Again, operator overloading"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "14KobqYu85gi"
      },
      "source": [
        "在大多数情况下，广播的时间和空间效率更高，因为广播运算不会在内存中具体化扩展的张量。\n",
        "\n",
        "使用 `tf.broadcast_to` 可以了解广播的运算方式。"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "GW2Q59_r8hZ6"
      },
      "outputs": [],
      "source": [
        "print(tf.broadcast_to(tf.constant([1, 2, 3]), [3, 3]))"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "Z2bAMMQY-jpP"
      },
      "source": [
        "与数学运算不同，比方说，`broadcast_to` 并不会节省内存。在这个例子中，张量已经具体化。\n",
        "\n",
        "这可能会变得更复杂。Jake VanderPlas 的*《Python 数据科学手册》*一书中的[这一节](https://jakevdp.github.io/PythonDataScienceHandbook/02.05-computation-on-arrays-broadcasting.html)介绍了更多广播技巧（同样使用 NumPy）。"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "o4Rpz0xAsKSI"
      },
      "source": [
        "## tf.convert_to_tensor\n",
        "\n",
        "大部分运算（如 `tf.matmul` 和 `tf.reshape`）会使用 `tf.Tensor` 类的参数。不过，在上面的示例中，您会发现我们经常传递形状类似于张量的 Python 对象。\n",
        "\n",
        "大部分（但并非全部）运算会在非张量参数上调用 `convert_to_tensor`。我们提供了一个转换注册表，大多数对象类（如 NumPy 的 `ndarray`、`TensorShape`、Python 列表和 `tf.Variable`）都可以自动转换。\n",
        "\n",
        "有关更多详细信息，请参阅 `tf.register_tensor_conversion_function`。如果您有自己的类型，则可能希望自动转换为张量。"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "05bBVBVYV0y6"
      },
      "source": [
        "## 不规则张量\n",
        "\n",
        "如果张量的某个轴上的元素个数可变，则称为“不规则”张量。对于不规则数据，请使用 `tf.ragged.RaggedTensor`。\n",
        "\n",
        "例如，下面的例子无法用规则张量表示："
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "VPc3jGoeJqB7"
      },
      "source": [
        "<table>\n",
        "<tr>\n",
        "  <th>“tf.RaggedTensor”，形状：<code>[4, None]</code>\n",
        "</th>\n",
        "</tr>\n",
        "<tr>\n",
        "  <td> <img alt=\"A 2-axis ragged tensor, each row can have a different length.\" src=\"images/tensor/ragged.png\">\n",
        "</td>\n",
        "</tr>\n",
        "</table>"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "VsbTjoFfNVBF"
      },
      "outputs": [],
      "source": [
        "ragged_list = [\n",
        "    [0, 1, 2, 3],\n",
        "    [4, 5],\n",
        "    [6, 7, 8],\n",
        "    [9]]"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "p4xKTo57tutG"
      },
      "outputs": [],
      "source": [
        "try:\n",
        "  tensor = tf.constant(ragged_list)\n",
        "except Exception as e:\n",
        "  print(f\"{type(e).__name__}: {e}\")"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "0cm9KuEeMLGI"
      },
      "source": [
        "应使用 `tf.ragged.constant` 来创建 `tf.RaggedTensor`："
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "XhF3QV3jiqTj"
      },
      "outputs": [],
      "source": [
        "ragged_tensor = tf.ragged.constant(ragged_list)\n",
        "print(ragged_tensor)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "sFgHduHVNoIE"
      },
      "source": [
        "`tf.RaggedTensor` 的形状包含未知维度："
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "Eo_3wJUWNgqB"
      },
      "outputs": [],
      "source": [
        "print(ragged_tensor.shape)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "V9njclVkkN7G"
      },
      "source": [
        "## 字符串张量\n",
        "\n",
        "`tf.string` 是一种 `dtype`，也就是说，在张量中，我们可以用字符串（可变长度字节数组）来表示数据。\n",
        "\n",
        "字符串是原子类型，无法像 Python 字符串一样编制索引。字符串的长度并不是张量的一个维度。有关操作字符串的函数，请参阅 `tf.strings`。"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "5P_8spEGQ0wp"
      },
      "source": [
        "下面是一个标量字符串张量："
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "sBosmM8MkIh4"
      },
      "outputs": [],
      "source": [
        "# Tensors can be strings, too here is a scalar string.\n",
        "scalar_string_tensor = tf.constant(\"Gray wolf\")\n",
        "print(scalar_string_tensor)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "CMFBSl1FQ3vE"
      },
      "source": [
        "下面是一个字符串向量："
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "IO-c3Tq3RC1L"
      },
      "source": [
        "<table>\n",
        "<tr>\n",
        "  <th>字符串向量，形状：<code>[3,]</code>\n",
        "</th>\n",
        "</tr>\n",
        "<tr>\n",
        "  <td> <img alt=\"The string length is not one of the tensor's axes.\" src=\"images/tensor/strings.png\">\n",
        "</td>\n",
        "</tr>\n",
        "</table>"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "41Dv2kL9QrtO"
      },
      "outputs": [],
      "source": [
        "# If we have three string tensors of different lengths, this is OK.\n",
        "tensor_of_strings = tf.constant([\"Gray wolf\",\n",
        "                                 \"Quick brown fox\",\n",
        "                                 \"Lazy dog\"])\n",
        "# Note that the shape is (3,). The string length is not included.\n",
        "print(tensor_of_strings)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "76gQ9qrgSMzS"
      },
      "source": [
        "在上面的打印输出中，`b` 前缀表示 `tf.string` dtype 不是 Unicode 字符串，而是字节字符串。有关在 TensorFlow 如何使用 Unicode 文本的详细信息，请参阅 [Unicode 教程](https://tensorflow.google.cn/tutorials/load_data/unicode)。"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "ClSBPK-lZBQp"
      },
      "source": [
        "如果传递 Unicode 字符，则会使用 utf-8 编码。"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "GTgL53jxSMd9"
      },
      "outputs": [],
      "source": [
        "tf.constant(\"🥳👍\")"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "Ir9cY42MMAei"
      },
      "source": [
        "在 `tf.strings` 中可以找到用于操作字符串的一些基本函数，包括 `tf.strings.split`。"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "8k2K0VTFyj8e"
      },
      "outputs": [],
      "source": [
        "# We can use split to split a string into a set of tensors\n",
        "print(tf.strings.split(scalar_string_tensor, sep=\" \"))"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "zgGAn1dfR-04"
      },
      "outputs": [],
      "source": [
        "# ...but it turns into a `RaggedTensor` if we split up a tensor of strings,\n",
        "# as each string might be split into a different number of parts.\n",
        "print(tf.strings.split(tensor_of_strings))"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "HsAn1kPeO84m"
      },
      "source": [
        "<table>\n",
        "<tr>\n",
        "  <th>三个字符串分割，形状：<code>[3, None]</code>\n",
        "</th>\n",
        "</tr>\n",
        "<tr>\n",
        "  <td> <img alt=\"Splitting multiple strings returns a tf.RaggedTensor\" src=\"images/tensor/string-split.png\">\n",
        "</td>\n",
        "</tr>\n",
        "</table>"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "st9OxrUxWSKY"
      },
      "source": [
        "以及 `tf.string.to_number`："
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "3nRtx3X9WRfN"
      },
      "outputs": [],
      "source": [
        "text = tf.constant(\"1 10 100\")\n",
        "print(tf.strings.to_number(tf.strings.split(text, \" \")))"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "r2EZtBbJBns4"
      },
      "source": [
        "虽然不能使用 `tf.cast` 将字符串张量转换为数值，但是可以先将其转换为字节，然后转换为数值。"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "fo8BjmH7gyTj"
      },
      "outputs": [],
      "source": [
        "byte_strings = tf.strings.bytes_split(tf.constant(\"Duck\"))\n",
        "byte_ints = tf.io.decode_raw(tf.constant(\"Duck\"), tf.uint8)\n",
        "print(\"Byte strings:\", byte_strings)\n",
        "print(\"Bytes:\", byte_ints)"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "uSQnZ7d1jCSQ"
      },
      "outputs": [],
      "source": [
        "# Or split it up as unicode and then decode it\n",
        "unicode_bytes = tf.constant(\"アヒル 🦆\")\n",
        "unicode_char_bytes = tf.strings.unicode_split(unicode_bytes, \"UTF-8\")\n",
        "unicode_values = tf.strings.unicode_decode(unicode_bytes, \"UTF-8\")\n",
        "\n",
        "print(\"\\nUnicode bytes:\", unicode_bytes)\n",
        "print(\"\\nUnicode chars:\", unicode_char_bytes)\n",
        "print(\"\\nUnicode values:\", unicode_values)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "fE7nKJ2YW3aY"
      },
      "source": [
        "`tf.string` dtype 可用于 TensorFlow 中的所有原始字节数据。`tf.io` 模块包含在数据与字节类型之间进行相互转换的函数，包括解码图像和解析 csv 的函数。"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "ua8BnAzxkRKV"
      },
      "source": [
        "## 稀疏张量\n",
        "\n",
        "在某些情况下，数据很稀疏，比如说在一个非常宽的嵌入空间中。为了高效存储稀疏数据，TensorFlow 支持 `tf.sparse.SparseTensor` 和相关运算。"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "mS5zgqgUTPRb"
      },
      "source": [
        "<table>\n",
        "<tr>\n",
        "  <th>“tf.SparseTensor”，形状：<code>[3, 4]</code>\n",
        "</th>\n",
        "</tr>\n",
        "<tr>\n",
        "  <td> <img alt=\"An 3x4 grid, with values in only two of the cells.\" src=\"images/tensor/sparse.png\">\n",
        "</td>\n",
        "</tr>\n",
        "</table>"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "B9nbO1E2kSUN"
      },
      "outputs": [],
      "source": [
        "# Sparse tensors store values by index in a memory-efficient manner\n",
        "sparse_tensor = tf.sparse.SparseTensor(indices=[[0, 0], [1, 2]],\n",
        "                                       values=[1, 2],\n",
        "                                       dense_shape=[3, 4])\n",
        "print(sparse_tensor, \"\\n\")\n",
        "\n",
        "# We can convert sparse tensors to dense\n",
        "print(tf.sparse.to_dense(sparse_tensor))"
      ]
    }
  ],
  "metadata": {
    "colab": {
      "collapsed_sections": [
        "Tce3stUlHN0L"
      ],
      "name": "tensor.ipynb",
      "toc_visible": true
    },
    "kernelspec": {
      "display_name": "Python 3",
      "name": "python3"
    }
  },
  "nbformat": 4,
  "nbformat_minor": 0
}
